{
  "cells": [
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "# Introduction\n\nIn this notebook, we demonstrate the steps needed to create an IoT Edge deployable module from the regression model created in the [turbofan regression](./turbofan_regression.ipynb) notebook. The steps we will follow are:\n   1. Reload experiment and model from the Azure Machine Learning service workspace\n   1. Create a scoring script\n   1. Create an environment YAML file\n   1. Create a container image using the model, scoring script and YAML file\n   1. Deploy the container image as a web service \n   1. Test the web service to make sure the container works as expected\n   1. Delete the web service\n   \n><font color=gray>Note: this notebook depends on the workspace, experiment and model created in the [turbofan regression](./turbofan_regression.ipynb) notebook.</font>"
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "# Set up notebook\n\nPlease ensure that you are running this notebook under the Python 3.6 Kernel. The current kernel is show on the top of the notebook at the far right side of the file menu. If you are not running Python 3.6 you can change it in the file menu by clicking **Kernel->Change Kernel->Python 3.6**"
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "from IPython.core.interactiveshell import InteractiveShell\nInteractiveShell.ast_node_interactivity = \"all\"\n\n%matplotlib inline",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Configure workspace\n\nCreate a workspace object from the existing workspace. `Workspace.from_config()` reads the file **aml_config/.azureml/config.json** and loads the details into an object named `ws`, which is used throughout the rest of the code in this notebook."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "from azureml.core.workspace import Workspace\nfrom azureml.core.experiment import Experiment\nfrom azureml.core.model import Model\nfrom azureml.train.automl.run import AutoMLRun\n\nws = Workspace.from_config(path='./aml_config')",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Load run, experiment and model\n\nUse the model information that we persisted in the [turbofan regression](./turbofan_regression.ipynb) noebook to load our model."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import json \n\n#name project folder and experiment\nmodel_data = json.load(open('./aml_config/model_config.json'))\n\nrun_id = model_data['regressionRunId']\nexperiment_name = model_data['experimentName']\nmodel_id = model_data['modelId']\n\nexperiment = Experiment(ws, experiment_name)\nautoml_run = AutoMLRun(experiment = experiment, run_id = run_id)\nmodel = Model(ws, model_id)",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "# Create scoring script\n\nThe scoring script is the piece of code that runs inside the container and interacts with the model to return a prediction to the caller of web service or Azure IoT Edge module that is running the container. The scoring script is authored knowing the shape of the message that will be sent to the container. In our case, we have chosen to format the message as:\n\n```json\n[{\n    \"DeviceId\": 81,\n    \"CycleTime\": 140,\n    \"OperationalSetting1\": 0.0,\n    \"OperationalSetting2\": -0.0002,\n    \"OperationalSetting3\": 100.0,\n    \"Sensor1\": 518.67,\n    \"Sensor2\": 642.43,\n    \"Sensor3\": 1596.02,\n    \"Sensor4\": 1404.4,\n    \"Sensor5\": 14.62,\n    \"Sensor6\": 21.6,\n    \"Sensor7\": 559.76,\n    \"Sensor8\": 2388.19,\n    \"Sensor9\": 9082.16,\n    \"Sensor10\": 1.31,\n    \"Sensor11\": 47.6,\n    \"Sensor12\": 527.82,\n    \"Sensor13\": 2388.17,\n    \"Sensor14\": 8155.92,\n    \"Sensor15\": 8.3214,\n    \"Sensor16\": 0.03,\n    \"Sensor17\": 393.0,\n    \"Sensor18\": 2388.0,\n    \"Sensor19\": 100.0,\n    \"Sensor20\": 39.41,\n    \"Sensor21\": 23.5488\n}]\n```"
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "script_file_name = 'score.py'",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "%%writefile $script_file_name\nimport pickle\nimport json\nimport numpy as np\nimport pandas as pd\nimport azureml.train.automl\nfrom sklearn.externals import joblib\nfrom azureml.core.model import Model\n\ndef init():\n    global model\n    model_path = Model.get_model_path(model_name = '<<modelname>>')\n    # deserialize the model file back into a sklearn model\n    model = joblib.load(model_path)\n    \ndef unpack_message(raw_data):\n    message_data = json.loads(raw_data)\n    # convert single message to list \n    if type(message_data) is dict:\n        message_data = [message_data]\n    return message_data\n    \ndef extract_features(message_data):\n    X_data = []\n    sensor_names = ['Sensor'+str(i) for i in range(1,22)]\n    \n    for message in message_data:\n        # select sensor data from the message dictionary\n        feature_dict = {k: message[k] for k in (sensor_names)}\n        X_data.append(feature_dict)\n    \n    X_df = pd.DataFrame(X_data)\n    return np.array(X_df[sensor_names].values)\n\ndef append_predict_data(message_data, y_hat):\n    message_df = pd.DataFrame(message_data)\n    message_df['PredictedRul'] = y_hat\n    return message_df.to_dict('records')\n\ndef log_for_debug(log_message, log_data):\n    print(\"*****%s:\" % log_message)\n    print(log_data)\n    print(\"******\")\n\ndef run(raw_data):\n    log_for_debug(\"raw_data\", raw_data)\n    \n    message_data = unpack_message(raw_data)\n    log_for_debug(\"message_data\", message_data)\n    \n    X_data = extract_features(message_data)\n    log_for_debug(\"X_data\", X_data)\n   \n    # make prediction\n    y_hat = model.predict(X_data)\n    \n    response_data = append_predict_data(message_data, y_hat)\n    return response_data",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "### Update the scoring script with the actual model ID"
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "# Substitute the actual model id in the script file.\n\nwith open(script_file_name, 'r') as cefr:\n    content = cefr.read()\n\nwith open(script_file_name, 'w') as cefw:\n    cefw.write(content.replace('<<modelname>>', model.name))",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Create YAML file for the environment\n\nThe YAML file provides the information about the dependencies for the model we will deploy. \n\n### Get azureml versions\n\nFirst we will use the run to retrieve the version of the azureml packages used to train the model. \n\n>Warnings about the version of the SDK not matching with the training version are expected "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "best_run, fitted_model = automl_run.get_output()\niteration = int(best_run.get_properties()['iteration'])\ndependencies = automl_run.get_run_sdk_dependencies(iteration = iteration)\nfor p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:\n    print('{}\\t{}'.format(p, dependencies[p]))",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "### Write YAML file \n\nWrite the initial YAML file to disk and update the dependencies for azureml to match with the training versions. This is not strictly needed in this notebook because the model likely has been generated using the current SDK version. However, we include this for completeness for the case when an experiment was trained using a previous SDK version."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import azureml.core\nfrom azureml.core.conda_dependencies import CondaDependencies\n\nmyenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','pandas','psutil'], pip_packages=['azureml-sdk[automl]'])\n\nconda_env_file_name = 'myenv.yml'\nmyenv.save_to_file('.', conda_env_file_name)\n\n# Substitute the actual version number in the environment file.\nwith open(conda_env_file_name, 'r') as cefr:\n    content = cefr.read()\n\nwith open(conda_env_file_name, 'w') as cefw:\n    cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-sdk']))",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Create a container image\n\nUse the scoring script and the YAML file to create a container image in the workspace. The image will take several minutes to create."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "from azureml.core.image import Image, ContainerImage\n\nimage_config = ContainerImage.image_configuration(runtime= \"python\",\n                                 execution_script = script_file_name,\n                                 conda_file = conda_env_file_name,\n                                 tags = {'area': \"digits\", 'type': \"automl_classification\"},\n                                 description = \"Image for Edge ML samples\")\n\nimage = Image.create(name = \"edgemlsample\",\n                     # this is the model object \n                     models = [model],\n                     image_config = image_config, \n                     workspace = ws)\n\nimage.wait_for_creation(show_output = True)\n\nif image.creation_state == 'Failed':\n    print(\"Image build log at: \" + image.image_build_log_uri)",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Deploy image as a web service on Azure Container Instance\n\nDeploy the image we just created as web service on Azure Container Instance (ACI). We will use this web service to test that our model/container performs as expected. "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "from azureml.core.webservice import AciWebservice\nfrom azureml.core.webservice import Webservice\naci_service_name = 'edge-ml-rul-01'\n\n\naci_config = AciWebservice.deploy_configuration(cpu_cores = 1, \n                                               memory_gb = 1, \n                                               tags = {'area': \"digits\", 'type': \"automl_RUL\"}, \n                                               description = 'test service for Edge ML RUL')\n\nprint (\"Deploying service: %s\" % aci_service_name)\n\naci_service = Webservice.deploy_from_image(deployment_config = aci_config,\n                                           image = image,\n                                           name = aci_service_name,\n                                           workspace = ws)\n\naci_service.wait_for_deployment(True)\nprint (\"Service state: %s\" % aci_service.state)",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Load test data\n\nTo save a couple of steps at this point, we serialized the test data that we loaded in the [turbofan regression](./turbofan_regression.ipynb) notebook. Here we deserialize that data to use it to test the web service."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import pandas as pd\nfrom sklearn.externals import joblib\nimport numpy\n\ntest_df = pd.read_csv(\"data/WebServiceTest.csv\")\n\ntest_df.head(5)",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Predict one message at a time\n\nOnce the container/model is deployed to and Azure IoT Edge device it will receive messages one at a time. Send a few messages in that mode to make sure everything is working."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import json\nimport pandas as pd\n\n# reformat data as list of messages\nX_message = test_df.head(5).to_dict('record')\n\nresult_list = []\nfor row in X_message:\n    row_data = json.dumps(row)\n    row_result = aci_service.run(input_data=row_data)\n    result_list.append(row_result[0])\n\nresult_df = pd.DataFrame(result_list)\nresiduals = result_df['RUL'] - result_df['PredictedRul']\nresult_df['Residual'] = residuals\nresult_df[['CycleTime','RUL','PredictedRul','Residual']]\n",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Predict entire set\n\nTo make sure the model as a whole is working as expected, we send the test set in bulk to the model, save the predictions, and calculate the residual."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import json\nimport pandas as pd\n\nX_messages = test_df.to_dict('record')\nraw_data = json.dumps(X_messages)\n\nresult_list = aci_service.run(input_data=raw_data)\nresult_df = pd.DataFrame(result_list)\nresiduals = result_df['RUL'] - result_df['PredictedRul']\nresult_df['Residual'] = residuals\n\ny_test = result_df['RUL']\ny_pred = result_df['PredictedRul']",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Plot actuals vs. predicted\n\nTo validate the shape of the model, plot the actual RUL against the predicted RUL for each cycle and device."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "from sklearn.metrics import mean_squared_error, r2_score\nimport matplotlib.pyplot as plt\nimport seaborn as sns\nsns.set()\n\nfig, ax = plt.subplots()\nfig.set_size_inches(8, 4)\n\nfont_size = 12\n\ng = sns.regplot(y='PredictedRul', x='RUL', data=result_df, fit_reg=False, ax=ax)\nlim_set = g.set(ylim=(0, 500), xlim=(0, 500))\nplot = g.axes.plot([0, 500], [0, 500], c=\".3\", ls=\"--\");\n\nrmse = ax.text(16,450,'RMSE = {0:.2f}'.format(numpy.sqrt(mean_squared_error(y_test, y_pred))), fontsize = font_size)\nr2 = ax.text(16,425,'R2 Score = {0:.2f}'.format(r2_score(y_test, y_pred)), fontsize = font_size)\n\nxlabel = ax.set_xlabel('Actual RUL', size=font_size)\nylabel = ax.set_ylabel('Predicted RUL', size=font_size)\n",
      "execution_count": null,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Delete web service\n\nNow that we are confident that our container and model are working well, delete the web service."
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "from azureml.core.webservice import Webservice\naci_service = Webservice(ws, 'edge-ml-rul-01')\n\naci_service.delete()",
      "execution_count": null,
      "outputs": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "name": "python36",
      "display_name": "Python 3.6",
      "language": "python"
    },
    "language_info": {
      "mimetype": "text/x-python",
      "nbconvert_exporter": "python",
      "name": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.6",
      "file_extension": ".py",
      "codemirror_mode": {
        "version": 3,
        "name": "ipython"
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 1
}
